Abstract: Compressed Sensing aims to capture attributes of a
sparse signal using very few measurements. Candès and Tao
showed that sparse reconstruction is possible if the sensing
matrix acts as a near isometry on all $k$-sparse signals. This
property holds with overwhelming probability if the entries of
the matrix are generated by an iid Gaussian or Bernoulli process.
There has been significant recent interest in an alternative
signal processing framework: exploiting deterministic sensing
matrices that with overwhelming probability act as a near
isometry on $k$-sparse vectors with uniformly random support,
a geometric condition that is called the Statistical Restricted
Isometry Property or StRIP. This paper considers a family
of deterministic sensing matrices satisfying the StRIP that are
based on Delsarte-Goethals codes (binary chirps) and a
$k$-sparse reconstruction algorithm with sublinear complexity. In
the presence of stochastic noise in the data domain, this paper
derives bounds on the $\ell_2$ accuracy of approximation in terms
of the $\ell_2$ norm of the measurement noise and the accuracy of
the best $k$-sparse approximation, also measured in the $\ell_2$ norm.
This type of $\ell_2/\ell_2$ bound is tighter than the standard
$\ell_2/\ell_1$ or $\ell_1/\ell_1$ bounds.

I. Introduction 

The central goal of compressed sensing is to capture attributes
of a signal using very few measurements. In most
work to date, this broader objective is exemplified by the
important special case in which a $k$-sparse vector $\alpha$ in $\mathbb{R}^C$
with $C$ large is to be reconstructed from a small number $N$ of
linear measurements with $k < N \ll C$. In this problem, the
measurement data is a vector $f = \Phi\alpha$, where $\Phi$ is an $N \times C$
matrix called the sensing matrix.

The work of Donoho [1] and of Candès, Romberg and Tao
[2] provides fundamental insight into the geometry of sensing
matrices. The Restricted Isometry Property (RIP) formulated
by Candès and Tao [3] is that the sensing matrix acts as
a near isometry on all $k$-sparse vectors, and this condition
is sufficient for sparse reconstruction. There are two broad
families of reconstruction algorithms, those based on convex
optimization and those based on greedy iteration. The Basis
Pursuit algorithms try to find the sparse approximation by
relaxing the non-convex $\ell_0$ loss to a convex optimization
task such as $\ell_1$ minimization or LASSO [2]. The Matching
Pursuit algorithms [4]-[6], on the other hand, try to solve
the recovery problem iteratively. At each iteration, one coordinate or a
list of coordinates is selected greedily to provide the best
approximation to the vector in the measurement domain. The
vector in the measurement domain is then updated accordingly
at the end of each iteration. Adjacency matrices of expander
graphs have been shown to provide similar performance [7]-[9].

One disadvantage of these Basis Pursuit (BP) and Matching
Pursuit (MP) algorithms is that their computational complexity is
super-linear in the dimension $C$ of the data domain, which is typically
very large if $k \ll C$. In this paper, focusing on average case
performance, we propose and analyze a Chirp Reconstruction
Algorithm that reconstructs a $k$-sparse vector iteratively by
forming the power spectrum of the measured superposition.
By contrast, the complexity of Chirp Reconstruction depends
only on the sparsity level $k$ and the number of measurements
$N$. A second disadvantage is that even though reconstructing
a $k$-sparse signal in the presence of noise in the data domain
is a fundamentally important problem, bounds on the accuracy
of approximation of BP and MP algorithms are not very tight.
Let $\alpha_k$ be $\alpha$ restricted to its $k$ most significant entries, $\mu$ be the
noise vector, and $\alpha^*$ be the output of the recovery algorithm.
An algorithm is said to provide $\ell_p/\ell_q$ recovery guarantees if

$$\|\alpha - \alpha^*\|_p \le C_1(k)\,\|\alpha - \alpha_k\|_q + C_2\,\|\mu\|_p.$$
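
As a rough illustration of the quantities in this definition, the following Python sketch computes the best $k$-term approximation and the right-hand side of an $\ell_p/\ell_q$ guarantee. The function names and the placeholder constants `c1` and `c2` are assumptions made for this example; they are not values taken from the paper.

```python
import numpy as np

def best_k_term(alpha, k):
    """Return alpha restricted to its k largest-magnitude entries."""
    alpha = np.asarray(alpha, dtype=float)
    alpha_k = np.zeros_like(alpha)
    if k > 0:
        keep = np.argsort(np.abs(alpha))[-k:]  # indices of the k largest |entries|
        alpha_k[keep] = alpha[keep]
    return alpha_k

def guarantee_rhs(alpha, k, mu, q, p, c1=1.0, c2=1.0):
    """C1 * ||alpha - alpha_k||_q + C2 * ||mu||_p with placeholder constants."""
    alpha = np.asarray(alpha, dtype=float)
    tail = alpha - best_k_term(alpha, k)  # part of alpha outside the top k entries
    return c1 * np.linalg.norm(tail, ord=q) + c2 * np.linalg.norm(mu, ord=p)
```

With `q=2, p=2` this evaluates the $\ell_2/\ell_2$ form of the bound, and with `q=1` the $\ell_2/\ell_1$ or $\ell_1/\ell_1$ forms, so the tail terms of the different guarantees can be compared on the same vector.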

The sparse reconstruction algorithms that use random dense
matrices provide $\ell_2/\ell_1$ guarantees, and the expander-based
reconstruction algorithms provide $\ell_1/\ell_1$ guarantees. The reason
again goes back to the worst-case versus stochastic modeling of
the noise in the data domain. A result by Cohen et al. [10]
shows that no reconstruction algorithm can provide $\ell_2/\ell_2$
reconstruction guarantees unless $N = \Omega(C)$. Nevertheless,
we show that if the signal consists of $k$ significant entries
covered by $C$ iid Gaussian noise samples, which is the case for many
compressed sensing applications, it is possible to derive $\ell_2/\ell_2$
guarantees.

Calderbank et al. [11] have considered deterministic sensing
matrices that with overwhelming probability act as a near
isometry on $k$-sparse vectors, and we refer to this geometric
property as the Statistical Restricted Isometry Property:

Definition 1. ($(k, \epsilon, \delta)$-StRIP matrix) An $N \times C$ (sensing)
matrix $\Phi$ is said to be $(k, \epsilon, \delta)$-StRIP if, for $k$-sparse vectors
$\alpha \in \mathbb{R}^C$, the inequalities

$$N(1-\epsilon)\|\alpha\|^2 \le \|\Phi\alpha\|^2 \le N(1+\epsilon)\|\alpha\|^2 \tag{1}$$

hold with probability exceeding $1 - \delta$ (with respect to a
uniform distribution of the vectors $\alpha$ among all $k$-sparse
vectors in $\mathbb{R}^C$ of the same norm).
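
The statistical nature of this definition can be probed numerically. The sketch below estimates, by Monte Carlo sampling, how often the two inequalities fail for a given matrix. It is an illustration only: the function name is hypothetical, and drawing the nonzero values from a Gaussian on a uniformly random support is an assumption standing in for the uniform distribution over $k$-sparse vectors of fixed norm.

```python
import numpy as np

def strip_failure_rate(Phi, k, eps, trials=2000, seed=0):
    """Estimate how often N(1-eps)|a|^2 <= |Phi a|^2 <= N(1+eps)|a|^2 fails."""
    rng = np.random.default_rng(seed)
    N, C = Phi.shape
    failures = 0
    for _ in range(trials):
        support = rng.choice(C, size=k, replace=False)  # uniformly random support
        alpha = np.zeros(C)
        alpha[support] = rng.standard_normal(k)         # assumed value distribution
        energy = np.linalg.norm(Phi @ alpha) ** 2
        reference = N * np.linalg.norm(alpha) ** 2
        if not ((1 - eps) * reference <= energy <= (1 + eps) * reference):
            failures += 1
    return failures / trials
```

The returned fraction is an empirical stand-in for $\delta$: a matrix with orthogonal columns of squared norm $N$ gives a rate of zero, while a matrix with many identical columns fails for most supports.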

The framework includes sensing matrices for which the 
columns are discrete chirps either in the standard Fourier 
domain [12] or the Walsh-Hadamard domain [13]. 

Chirp Reconstruction is similar to Matching Pursuit in
that at each iteration it identifies a significant component of
the $k$-sparse signal. The overall computational complexity of
Chirp Reconstruction applied to Reed-Muller sensing matrices
is $O(kN\log^2 N)$. The StRIP property of the Reed-Muller
sensing matrices makes it possible to accurately recover the
coefficients of the $k$ significant components, leading to robust
recovery guarantees in the presence of noise both in the data
and in the measurement domains. These guarantees apply with
overwhelming probability to the class of approximately
$k$-sparse signals.

II. Delsarte-Goethals Codes 

Here $m$ is odd, the rows of the sensing matrix $\Phi$ are indexed
by binary $m$-tuples $x$, and the columns are indexed by pairs
$P, b$, where $P$ is an $m \times m$ binary symmetric matrix and $b$ is
a binary $m$-tuple. The entry $\varphi_{P,b}(x)$ is given by

$$\varphi_{P,b}(x) = i^{\mathrm{wt}(d_P) + 2\,\mathrm{wt}(b)}\; i^{xPx^T + 2bx^T} \tag{2}$$

where $d_P$ denotes the main diagonal of $P$, and wt denotes the
Hamming weight (the number of 1s in the binary vector).
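
To make the entry formula concrete, here is a minimal Python sketch that builds one column for a given pair $(P, b)$, reading the exponent as wt$(d_P) + 2\,$wt$(b) + xPx^T + 2bx^T$ evaluated over the integers and reduced modulo 4. The function name `dg_column` and the row ordering (binary $m$-tuples in lexicographic order) are assumptions made for this example. The sketch does not construct the Delsarte-Goethals set itself; $P$ must be supplied by the caller.

```python
import numpy as np
from itertools import product

I_POWERS = np.array([1, 1j, -1, -1j])  # i**0, i**1, i**2, i**3

def dg_column(P, b):
    """Column phi_{P,b}: one entry per binary m-tuple x, listed lexicographically."""
    P = np.asarray(P, dtype=int)
    b = np.asarray(b, dtype=int)
    m = b.size
    assert P.shape == (m, m) and np.array_equal(P, P.T), "P must be m x m symmetric"
    X = np.array(list(product((0, 1), repeat=m)), dtype=int)  # all 2^m row indices
    quad = np.einsum("ni,ij,nj->n", X, P, X)                  # x P x^T over the integers
    lin = X @ b                                               # b x^T over the integers
    exponent = np.trace(P) + 2 * b.sum() + quad + 2 * lin     # wt(d_P) = trace of P
    return I_POWERS[exponent % 4]
```

Every entry has unit modulus, so each column has squared norm $N = 2^m$. Using a lookup table for the powers of $i$ keeps the entries exact instead of relying on floating-point exponentiation.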

The Delsarte-Goethals set $DG(m, r)$ is a binary vector
space containing $2^{(r+1)m}$ binary symmetric matrices with the
property that the difference of any two distinct matrices has
rank at least $m - 2r$ (see [14]). The Delsarte-Goethals sets
are nested:

$$DG(m,0) \subset DG(m,1) \subset \cdots \subset DG\left(m, \tfrac{m-1}{2}\right).$$

The first set $DG(m, 0)$ is the classical Kerdock set, and
the last set $DG(m, (m-1)/2)$ is the set of all binary symmetric
matrices. The $r$th Delsarte-Goethals sensing matrix is determined
by $DG(m, r)$ and has $N = 2^m$ rows and $C = 2^{(r+2)m}$
columns, and the column sums in the $r$th Delsarte-Goethals
sensing matrix satisfy

$$\Big|\sum_x \varphi_{P,b}(x)\Big|^2 = 0 \ \text{ or } \ N^{2 - t/m} \ \text{ for some } t \in \{m-2r, \dots, m\}. \tag{3}$$


We will use the following lemmas which characterize the 
properties of the Delsarte-Goethals matrices. For detailed 
proofs see [11]. 

Lemma 1. Let $\mathcal{G} = \mathcal{G}(m, r)$ be the set of column vectors $\varphi_{P,b}$,
where $b \in \mathbb{F}_2^m$ and where the binary symmetric matrix $P$ varies
over the Delsarte-Goethals set $DG(m, r)$. Then $\mathcal{G}$ is a group
of order $2^{(r+2)m}$ under pointwise multiplication.

The following theorem has been proved by Calderbank
et al.

Theorem 2. Suppose the $N \times C$ matrix $\Phi$ is derived from a
$DG(m,r)$ family, and let $\eta = 1 - 2r/m$. Then for any $k, \epsilon$
with $k < 1 + (C - 1)\epsilon$, $\Phi$ is $(k,\epsilon,\delta)$-StRIP with

$$\delta := 2\exp\left(-\frac{[\epsilon - (k-1)/(C-1)]^2\, N^{\eta}}{32k}\right).$$
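
To get a feel for the size of the failure probability in Theorem 2, the following Python sketch evaluates it for given parameters. It assumes the reading $\delta = 2\exp\left(-[\epsilon - (k-1)/(C-1)]^2 N^{\eta}/(32k)\right)$ with $N = 2^m$, $C = 2^{(r+2)m}$ and $\eta = 1 - 2r/m$; the function name is hypothetical.

```python
import math

def strip_delta(m, r, k, eps):
    """Failure probability delta of Theorem 2 under the stated reading."""
    N = 2 ** m
    C = 2 ** ((r + 2) * m)
    eta = 1 - 2 * r / m
    gap = eps - (k - 1) / (C - 1)
    if gap <= 0:
        # the theorem requires k < 1 + (C - 1) * eps
        raise ValueError("need k < 1 + (C - 1) * eps")
    return 2 * math.exp(-(gap ** 2) * N ** eta / (32 * k))
```

For the Kerdock case $m = 11$, $r = 0$ (so $N = 2048$) with $k = 5$ and $\epsilon = 0.5$, the exponent is about $-3.2$, so the bound is already well below one at modest dimensions and decays exponentially as $N^{\eta}/k$ grows.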



III. The Chirp Reconstruction Algorithm 

In this section we introduce the Chirp Reconstruction Algorithm,
used for efficient sparse reconstruction
in the presence of noise. Let $\pi = \{\pi_1, \dots, \pi_C\}$ be a random
permutation of $\{1, \dots, C\}$, and let $\alpha$ be an almost $k$-sparse
vector whose $k$ significant entries are positioned according
to $\{\pi_1, \dots, \pi_k\}$. Let $\alpha_k$ be the best $k$-term
approximation of $\alpha$. Calderbank et al. showed that if $\Phi$ is $(k, \epsilon, \delta)$-StRIP,
then with probability $1 - \delta$,

$$\|\Phi(\alpha - \alpha_k)\|_2 \le \|\alpha - \alpha_k\|_1. \tag{4}$$

Furthermore, if we assume that $\alpha$ is an exactly $k$-sparse vector
covered by $C$ iid white noise samples with variance $\sigma_C^2$, then
since the rows of $\Phi$ form a tight frame with redundancy $C/N$,
it follows that the noise samples on distinct measurements are
independent Gaussian, with variance $C\sigma_C^2/N$. As a result, using
the concentration bounds for the $\chi^2$ distribution, it follows that
with overwhelming probability

$$\frac{1}{\sqrt{N}}\|\Phi(\alpha - \alpha_k)\|_2 \le \|\alpha - \alpha_k\|_2. \tag{5}$$



Let $\mu$ be the noise in the measurement domain. Then
compressive sensing using the matrix $\frac{1}{\sqrt{N}}\Phi$ maps a vector
$\alpha$ to

$$f = \frac{1}{\sqrt{N}}\Phi\alpha + \mu = y + \nu,$$

where $y = \frac{1}{\sqrt{N}}\Phi\alpha_k$, and $\nu = \frac{1}{\sqrt{N}}\Phi(\alpha - \alpha_k) + \mu$. The
goal is then to approximate $\alpha_k$ from $f$. The chirp reconstruction
algorithm [12], [13] is a repurposing of the chirp
detection algorithm commonly used in navigation radars, which
is known to work extremely well in the presence of noise,
and is described as Algorithm 1. At each iteration $t$, given
the residual measurement vector $f^t$, first the autocorrelation
function is applied to $f^t$, i.e., $f^t$ is pointwise multiplied with
a shifted version of itself. Then applying the fast Hadamard
transform forms the power spectrum of $f^t$, which, as we will
show, consists of $k$ tones corresponding to the positions of the
$k$ significant entries of $\alpha$, and a noise term uniformly spread
across all Hadamard coefficients, which accounts for the noise
$\nu$ and the chirp-like cross terms. In other words, since the sensing
matrix is obtained by exponentiating quadratic functions,
forming the power spectrum produces a sparse superposition
of pure frequencies (in the example below, these are Walsh
functions in the binary domain) against a background of
chirp-like cross terms. The algorithm then iteratively learns the
terms in the sparse superposition by varying the offset $a$.
These terms can be peeled off in decreasing order of signal
strength or processed in a list. Experimental results show a close
approach to the information theoretic lower bound on the
required number of measurements [13].
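
The shift-multiply-transform step can be illustrated on a single noiseless chirp. The Python sketch below is an assumption-laden toy: it omits the constant phase prefactor of the column (which cancels in the product), takes the shifted copy times the complex conjugate of the original, and indexes binary $m$-tuples lexicographically. The helper names `chirp`, `fwht` and `shift_and_transform` are hypothetical. Under these assumptions the product equals a constant phase times $(-1)^{aPx^T}$, so the Walsh-Hadamard spectrum has one nonzero coefficient located at the binary vector $aP$.

```python
import numpy as np
from itertools import product

def chirp(P, b):
    """Entries i^(x P x^T + 2 b x^T) over all binary m-tuples x (lexicographic)."""
    P, b = np.asarray(P, dtype=int), np.asarray(b, dtype=int)
    X = np.array(list(product((0, 1), repeat=b.size)), dtype=int)
    exponent = np.einsum("ni,ij,nj->n", X, P, X) + 2 * (X @ b)
    return np.array([1, 1j, -1, -1j])[exponent % 4]

def fwht(v):
    """Unnormalized fast Walsh-Hadamard transform (natural ordering)."""
    v = np.array(v, dtype=complex)
    h = 1
    while h < v.size:
        for start in range(0, v.size, 2 * h):
            for j in range(start, start + h):
                v[j], v[j + h] = v[j] + v[j + h], v[j] - v[j + h]
        h *= 2
    return v

def shift_and_transform(f, a):
    """Multiply f(x + a) by conj(f(x)) pointwise, then take the transform."""
    f = np.asarray(f)
    a_index = int("".join(str(bit) for bit in a), 2)  # offset a as an integer mask
    shifted = f[np.arange(f.size) ^ a_index]          # x + a over F_2 is bitwise XOR
    return fwht(shifted * np.conj(f))
```

Taking $a$ to be the $j$th standard basis vector makes the peak location equal to the $j$th row of $P$, which is how the rows of $P$ are read off one offset at a time in the description above.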

Algorithm 1 Chirp Reconstruction Algorithm
Input: $N$-dimensional vector $f^1 = \frac{1}{\sqrt{N}}\Phi\alpha_k + \nu$.
Output: An approximation $\alpha^*$ to the $k$-sparse signal $\alpha_k$.

1: for $t = 1, \dots, k$ or while $\|f^t\|_2 \ge \epsilon$ do
2:   for $j = 1, \dots, m$ do
3:     Let $a_j$ be the $j$th standard basis vector. Using $a_j$,
       pointwise multiply $f^t$ with its shifted vector.
4:     Compute the fast Walsh-Hadamard transform of the
       computed autocorrelation: Equation (8).
5:     Find the position of the next peak $l_t^j$ in the
       Hadamard domain. Decode the $j$th row of $P_{\pi_t}$.
6:   end for
7:   Pointwise multiply $f^t$ with $i^{xP_{\pi_t}x^T}$ and find the corresponding
     value $b_{\pi_t}$ by finding the next peak in the power spectrum.
8:   Determine the corresponding value $\alpha_{\pi_t}$ which minimizes
     $\|\sqrt{N}f^t - \alpha_{\pi_t}\varphi_{P_{\pi_t},b_{\pi_t}}\|_2$.
9:   Set $f^{t+1} = f^t - \alpha_{\pi_t}\varphi_{P_{\pi_t},b_{\pi_t}}$.
10: end for
11: Let $\Phi_\pi^k$ be $\Phi$ restricted to the recovered $k$ columns.
    Output $\alpha^* = \arg\min_\alpha \|\frac{1}{\sqrt{N}}\Phi_\pi^k\alpha - f\|_2$.

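The first step is pointwise multiplication of the sparse
superposition with a shifted copy of itself, which gives

$$y(x+a)\overline{y(x)} + \nu(x+a)\overline{\nu(x)} + y(x+a)\overline{\nu(x)} + \nu(x+a)\overline{y(x)}. \tag{6}$$

By the Cauchy-Schwarz inequality and the StRIP property, it is easy
to verify that the total energy of the last three terms in
(6) is bounded by $3\|\nu\|_2\|\alpha_k\|_2$. The first term itself can be
decomposed into pure tones proportional to $|\alpha_j|^2(-1)^{aP_jx^T}$, and
chirp terms

$$\frac{1}{N}\sum_{i \ne j}\alpha_i\alpha_j\,\varphi_{P_i,b_i}(x+a)\,\overline{\varphi_{P_j,b_j}(x)}, \tag{7}$$

where $(P_j, b_j)$ indexes the column of the $j$th significant entry.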



Then the (fast) Hadamard transform concentrates the energy
associated with the pure tones into (at most) $k$ Walsh-Hadamard
tones with energies $|\alpha_i|^4$. This algorithm may get into trouble
when two of the pure tones fall into the same basis. This
problem can be resolved to a large extent by varying the offset
$a$ [13]. In the next section, we show that the fast
Hadamard transform distributes the energy of Equation (7)
uniformly across all $N$ tones in the fast Hadamard domain.
Moreover, by Azuma's inequality, the total energy of the chirp
terms (Equation (7)) is bounded with high probability. The effect
of the chirp terms is to reduce the signal strength in the $k$
concentrated peaks, which does not prevent detection of the largest
peak provided the SNR is sufficiently large.

IV. Analysis of the Algorithm 
The $l$th Fourier coefficient of the term (7) is

(8) 

In this section we show that, with overwhelming probability,
every Fourier coefficient $\Gamma_a^l$ is small relative to $\|\alpha_k\|^2$
(Theorem 5 gives the precise bound), where the
probability is with respect to the permutation $\pi$. We show this
by a probabilistic argument. First we show that $E_\pi[\Gamma_a^l] = 0$,
and then, by constructing an appropriate martingale sequence
and applying Azuma's inequality, we show that $|\Gamma_a^l|$ is
highly concentrated around its expectation.

Let $\mathcal{T}$ be the set of all $k$-tuples $(t_1, \dots, t_k)$ such that
$\{t_1, \dots, t_C\}$ is a permutation of $\{1, \dots, C\}$. For all distinct
$i, j$ in $\{1, \dots, k\}$, and $(t_1, \dots, t_k)$ in $\mathcal{T}$ define

$$h(t_i, t_j) = \sum_x (-1)^{lx^T}\,\varphi_{P_{t_i},b_{t_i}}(x+a)\,\overline{\varphi_{P_{t_j},b_{t_j}}(x)}, \tag{9}$$



and 



Tiih,--- ,t k ) = — T y^aiajhfat,), (10) 



Then (8) can be written as $\Gamma_a^l(\pi_1, \dots, \pi_k)$. We first show that
$E_\pi[\Gamma_a^l(\pi_1, \dots, \pi_k)] = 0$.

Lemma 3. Let $\mathcal{G}$ be the group of columns of $\Phi$ with respect
to pointwise multiplication. The map $\mathcal{G} \times \mathcal{G} \to \{\pm 1, \pm i\}$ given
by $(g,h) \mapsto g(x+a)h^{-1}(x)$ is a surjective homomorphism,
and

$$\sum_{g \ne h} g(x+a)h^{-1}(x) = -\sum_{g} g(x+a)g^{-1}(x).$$

Proof: $\sum_{g,h} g(x+a)h^{-1}(x) = 0$. ■

Lemma 4. $E_\pi[\Gamma_a^l(\pi)]$ is zero.
Proof: We can rewrite

x 

in the form 



C(C - 1) 



(11) 



g^h 



The initial factor is just the frequency with which any admissible
pair is chosen, and the second sum is taken over the
column group $\mathcal{G}$. Lemma 3 allows us to rewrite (11) as

x g 
^ L > P X b 

where the outer sum is taken over all binary symmetric
matrices in the Delsarte-Goethals ensemble. Since
$a \ne 0$, the sum $\sum_b (-1)^{ba^T}$ is always zero. ■

Theorem 5. Let $\pi$ be a random permutation of $\{1, \dots, C\}$.
Then with probability at least $1 - \delta$, for any coefficient $l$ we
have



rS(7ri.---,7r fc )< 



'8fclog 



. S , 



\a\ 



(13) 



]\[l-r/m 

Proof: Define the martingale sequence $Z_1, \dots, Z_k$ as

$$Z_i = E_\pi\left[\Gamma_a^l(\pi_1, \dots, \pi_k) \mid \pi_1, \dots, \pi_i\right], \tag{14}$$

and denote $\pi_1^i = (\pi_1, \dots, \pi_i)$. Since the columns of $\Phi$ form a
group under pointwise multiplication, using Equation (3) we
get



< 



supE. [r|(4) I k\-\u] - inf E w [ri(7rf ] 



/v- 



M] 

(15) 



Note that by the Cauchy-Schwarz inequality



Consequently, by applying Azuma's inequality we get 

Pr [Tiim,--- ,7r fc ) > e] < exp 



Applying the union bound over all $N$ possible choices of $l$
completes the proof. ■
Consequently, the chirp-like terms are spread uniformly
across all $N$ tones in the fast Hadamard domain. Hence,
if $k \ll C$ and the SNR is sufficiently large, it is possible to
iteratively recover the positions of the $k$ significant entries
of the vector $\alpha$. Having recovered the support $\pi_1^k$ of $\alpha_k$, it
is possible to reconstruct a better approximation for $\alpha_k$ by
minimizing $\|\frac{1}{\sqrt{N}}\Phi_\pi^k\alpha - f\|_2$, which has the analytical solution



(16) 
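
The refitting step on a recovered support is an ordinary least-squares problem. As a minimal sketch, assuming the objective $\|\frac{1}{\sqrt{N}}\Phi_S\alpha - f\|_2$ over vectors supported on an index set $S$, the standard normal equations would give $\sqrt{N}\,(\Phi_S^{\dagger}\Phi_S)^{-1}\Phi_S^{\dagger}f$ on the support; the code below obtains the same minimizer with a numerically stable solver. The function name `refit_on_support` and the argument layout are assumptions made for this example.

```python
import numpy as np

def refit_on_support(Phi, support, f):
    """Least-squares coefficients on the given columns, zero elsewhere."""
    N = Phi.shape[0]
    A = Phi[:, support] / np.sqrt(N)               # restricted, normalized sensing matrix
    coeffs, *_ = np.linalg.lstsq(A, f, rcond=None) # minimizes ||A c - f||_2
    alpha_star = np.zeros(Phi.shape[1], dtype=coeffs.dtype)
    alpha_star[support] = coeffs
    return alpha_star
```

Only $k$ columns enter the solve, so its cost depends on $k$ and $N$ rather than on the ambient dimension $C$, in line with the complexity claims made for the overall algorithm.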



The following bound on the approximation error of $\alpha^*$ then
follows from the StRIP property.

Theorem 6. Let $\Phi$ be $(k,\epsilon,\delta)$-StRIP. Let $\alpha$ be an almost
$k$-sparse vector such that $\alpha_k$ has a uniformly random support
$\{\pi_1, \dots, \pi_k\}$. Let $\alpha^*$ be defined by Equation (16). Then with
probability $1 - \delta$,



$$\|\alpha^* - \alpha_k\|_2 \le \frac{2}{1-\epsilon}\left(\frac{1}{\sqrt{N}}\|\Phi(\alpha - \alpha_k)\|_2 + \|\mu\|_2\right).$$



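Proof: Since $\Phi$ is $(k, \epsilon, \delta)$-StRIP, and $\alpha_k$ and $\alpha^*$ are
two $k$-sparse vectors with the same random support, with
probability $1 - \delta$, $(1 - \epsilon)\|\alpha^* - \alpha_k\|_2 \le \frac{1}{\sqrt{N}}\|\Phi(\alpha^* - \alpha_k)\|_2$.
By the triangle inequality

$$\frac{1}{\sqrt{N}}\|\Phi(\alpha^* - \alpha_k)\|_2 \le \left\|\frac{1}{\sqrt{N}}\Phi\alpha^* - f\right\|_2 + \|\nu\|_2.$$

On the other hand, by definition of $\alpha^*$ we have

$$\left\|\frac{1}{\sqrt{N}}\Phi\alpha^* - f\right\|_2 \le \left\|\frac{1}{\sqrt{N}}\Phi\alpha_k - f\right\|_2 \le \|\nu\|_2.$$

Putting these together, and recalling that

$$\|\nu\|_2 \le \frac{1}{\sqrt{N}}\|\Phi(\alpha - \alpha_k)\|_2 + \|\mu\|_2,$$

completes the proof. ■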
As a result, it follows from Equation (4) that, with probability
at least $1 - 2\delta$,



$$\|\alpha^* - \alpha_k\|_2 \le \frac{2}{1-\epsilon}\left(\frac{1}{\sqrt{N}}\|\alpha - \alpha_k\|_1 + \|\mu\|_2\right),$$


and furthermore, considering Equation (5), if the signal in the
data domain consists of $k$ significant entries covered by white
noise, then with overwhelming probability

2 



a - atkh < 



(1-e) 